High-Dimensional Statistical Learning: Roots, Justifications, and Potential Machineries
نویسنده
چکیده
High-dimensional data generally refer to data in which the number of variables is larger than the sample size. Analyzing such datasets poses great challenges for classical statistical learning because the finite-sample performance of methods developed within classical statistical learning does not live up to classical asymptotic premises in which the sample size unboundedly grows for a fixed dimensionality of observations. Much work has been done in developing mathematical-statistical techniques for analyzing high-dimensional data. Despite remarkable progress in this field, many practitioners still utilize classical methods for analyzing such datasets. This state of affairs can be attributed, in part, to a lack of knowledge and, in part, to the ready-to-use computational and statistical software packages that are well developed for classical techniques. Moreover, many scientists working in a specific field of high-dimensional statistical learning are either not aware of other existing machineries in the field or are not willing to try them out. The primary goal in this work is to bring together various machineries of high-dimensional analysis, give an overview of the important results, and present the operating conditions upon which they are grounded. When appropriate, readers are referred to relevant review articles for more information on a specific subject.
منابع مشابه
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
متن کاملSVMs—a practical consequence of learning theory
Bernhard Schölkopf, GMD First Is there anything worthwhile to learn about the new SVM algorithm, or does it fall into the category of “yet-another-algorithm,” in which case readers should stop here and save their time for something more useful? In this short overview, I will try to argue that studying support-vector learning is very useful in two respects. First, it is quite satisfying from a t...
متن کاملÓ Ú Ö Ó Û Ü 1⁄2 Ý »
Is there anything worthwhile to learn about the new SVM algorithm, or does it fall into the category of “yet-another-algorithm,” in which case readers should stop here and save their time for something more useful? In this short overview, I will try to argue that studying support-vector learning is very useful in two respects. First, it is quite satisfying from a theoretical point of view: SV l...
متن کاملThe Effect of Three-Dimensional Model on Anatomy Learning of Middle Ear
Purpose: The aim of this study was to study the effect of three-dimensional model in learning the anatomy of middle ear. Materials and Methods: The study was conducted at Artesh University of Medical Sciences in 3 phases in 2007: 1- preparation of three-dimensional model with reference to the Gray's Anatomy for Students (2005-1st edition), 2- dividing medical and nursing students into 4 grouos ...
متن کاملOn the Consistency of Bayesian Variable Selection for High Dimensional Binary Regression and Classification
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. We use a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 14 شماره
صفحات -
تاریخ انتشار 2015